Rounding Effects in Record Statistics 
We analyze record-breaking events in time series of continuous random variables
that are subsequently discretized by rounding to integer multiples of a
discretization scale $\Delta > 0$. Rounding leads to ties of an existing record,
thereby reducing the number of new records. For an infinite number of random
variables that are drawn from distributions with a finite upper limit, the number
of discrete records is finite, while for distributions with a thinner than
exponential upper tail, fewer discrete records arise compared to continuous
variables. In the latter case the record sequence becomes highly regular at long
times.
The statistics of record-breaking events have been widely studied in many
contexts, including sports [1], evolutionary biology [2], the theory of spin
glasses [3], and the possible role of global warming in the occurrence of
record-breaking temperatures [4-9]. Records are defined as the entries in a time
series of measurements that exceed all previous values. While the record
statistics of independent, identically distributed (iid) random variables (RVs)
that are drawn from continuous distributions are well understood [10, 11], the
understanding of records drawn from time-dependent distributions [12-14] and from
series of correlated RVs [15, 16] is still developing.

Here we address discreteness effects on record statistics. Conventionally, record
statistics are formulated for variables that are drawn from a continuous
distribution. However, in all practical applications, technical limitations cause
observations to be discrete, even if the underlying distribution is continuous.
In sports or meteorology, distance, time, temperature, or precipitation
measurements are always rounded to a certain accuracy, resulting in an
effectively discrete distribution of RVs. Thus ties of existing records can
arise, which alters the probability for a record to occur in any given
observation (Fig. 1).

For RVs that are explicitly drawn from discrete distributions, ties strongly
affect the number of records [17-21]. For related $\delta$-records and geometric
records, where a new record arises only if the current observation exceeds the
current record by a fixed constant $\delta$ [21, 22] or by a fixed fraction [23],
intriguing statistical properties of records were found for the three
universality classes of extreme value statistics (EVS) [3]. However, the
consequences of measuring rounded record values that are drawn from continuous
underlying distributions appear not to have been studied previously.

FIG. 1: (color online) Effect of rounding down records with discretization unit
$\Delta$. Inverted triangles indicate records, with those that survive after
rounding shown solid. The dashed line shows the evolution of the rounded record
value.

We consider a set of RVs $X_1, \ldots, X_N$ and focus on the probability
$P_n = \mathrm{Prob}(X_n > X_1, \ldots, X_{n-1})$ that the $n$th variable in this
series is a record. We refer to $P_n$ as the record rate and to
$R_n = \sum_{k=1}^{n} P_k$ as the record number. For continuous iid RVs, the
universal result is $P_n = 1/n$. Thus for $n \gg 1$, $R_n \approx \ln n + \gamma$,
with $\gamma \approx 0.577$ the Euler constant. We assume that the RVs $X_i$ are
discretized in units of a minimal scale $\Delta$. That is, each $X_i$ gets rounded
to a value $X_i^\Delta = k\Delta$. We may consider (i) rounding down, with
$k = \lfloor X_i/\Delta \rfloor$ and $\lfloor x \rfloor$ the floor function, which
gives the largest integer not exceeding $x$, or (ii) rounding to the nearest
lattice point, with $k = \lfloor X_i/\Delta + 1/2 \rfloor$. Because asymptotic
results do not depend on the rounding protocol, we will discuss only rounding
down. We define the strong record rate



$$P_n^\Delta = \mathrm{Prob}\left(X_n^\Delta > X_1^\Delta, \ldots, X_{n-1}^\Delta\right), \tag{1}$$



in which ties caused by the discretization are not counted as new records. Thus
not only $X_n$, but also the rounded value $X_n^\Delta$ has to be larger than all
previous RVs for a new record to occur (Fig. 1).
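
The distinction between ordinary and strong records can be made concrete with a
short counting routine. The following is a minimal sketch, not the procedure used
for the simulations reported here; the function names `count_records` and
`mean_record_number`, and the choice of exponential RVs for the Monte Carlo
average, are assumptions made for illustration.

```python
import math
import random

def count_records(values, delta=None):
    """Count records in a sequence.

    With delta=None, ordinary records of the raw values are counted.
    With delta > 0, each value is first rounded down to k*delta and only
    strong records (a strictly larger lattice index k) are counted.
    """
    count = 0
    best = None  # current record: raw value or lattice index
    for x in values:
        key = x if delta is None else math.floor(x / delta)
        if best is None or key > best:
            best = key
            count += 1
    return count

# Every entry of this series is a record of the raw values, but rounding
# down with delta = 1 turns two of them into ties.
series = [0.3, 0.35, 1.2, 1.9, 2.1]
raw_records = count_records(series)
strong_records = count_records(series, delta=1.0)

def mean_record_number(n, delta, trials, seed=0):
    """Monte Carlo estimate of the mean record number for n exponential RVs
    (the exponential distribution is an illustrative assumption)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_records((rng.expovariate(1.0) for _ in range(n)), delta)
    return total / trials
```

Because a strong record of the rounded series is always a record of the raw
series, the strong count can never exceed the ordinary count for the same
realization.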

General theory, asymptotic results. For iid RVs $X_i$ drawn from a distribution
with probability density $f(x)$ and cumulative distribution
$F(x) = \int^x dy\, f(y)$, the record rate is obtained from
$P_n = \int dx\, f(x) F^{n-1}(x)$. For any continuous density $f(x)$, this
integral gives the universal behavior mentioned above, $P_n = 1/n$. However, if
the measurement $X_i$ is rounded down to $X_i^\Delta$, the integral for $P_n$
breaks into the sum

$$P_n^\Delta = \sum_k \int_{k\Delta}^{(k+1)\Delta} dx\, f(x)\, F^{n-1}(k\Delta) = \sum_k \left[F((k+1)\Delta) - F(k\Delta)\right] F^{n-1}(k\Delta). \tag{2}$$

This gives the strong record rate from continuous RVs that are rounded down to
the closest integer multiple of $\Delta$. We emphasize that in the practically
more relevant case where record values are rounded either up or down to the
closest integer multiple of $\Delta$, the record rate has the same statistical
properties as for rounding down only. We now give asymptotic results for
$P_n^\Delta$ for the three basic classes of EVS [24]: Weibull (distributions with
a finite upper limit), Gumbel (unbounded upper tail decaying faster than any
power law), and Fréchet (power-law upper tail). Our asymptotic approximations for
the discrete record rate $P_n^\Delta$ for these classes of EVS agree well with
numerical results.
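
The lattice sum for the strong record rate is straightforward to evaluate
numerically for any cumulative distribution. The sketch below assumes that the
caller supplies the range of lattice indices covering the support; the names
`strong_record_rate` and `uniform_cdf` are hypothetical helpers, not part of any
library.

```python
def strong_record_rate(cdf, delta, n, k_min, k_max):
    """Evaluate sum_k [F((k+1)*delta) - F(k*delta)] * F(k*delta)**(n-1).

    The lattice index k runs over k_min..k_max inclusive; the caller must
    choose the range so that it covers the support of the distribution.
    """
    total = 0.0
    for k in range(k_min, k_max + 1):
        f_lo = cdf(k * delta)
        f_hi = cdf((k + 1) * delta)
        total += (f_hi - f_lo) * f_lo ** (n - 1)
    return total

def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

# Uniform RVs with delta = 1/4: the lattice points inside [0, 1) are k = 0..3.
rate_n1 = strong_record_rate(uniform_cdf, 0.25, 1, 0, 3)  # first entry is always a record
rate_n3 = strong_record_rate(uniform_cdf, 0.25, 3, 0, 3)
```

For $n = 1$ the sum returns 1, as it must, since the first observation is always
a record.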

Weibull class: For illustration, we start with the uniform distribution:
$f(x) = 1$ for $x \in [0, 1]$ and 0 otherwise. For discretization scale
$\Delta = 1/L$, with integer-valued $L > 1$, Eq. (2) reduces to:

$$P_n^\Delta = \sum_{k=1}^{L-1} \Delta\, (k\Delta)^{n-1} = \Delta^n H_{L-1}^{(1-n)}, \tag{3}$$



where $H_m^{(n)}$ is the $m$th harmonic number of power $n$. At some point in the
time series of RVs, a record with a rounded value $1 - \Delta$ occurs; this is
necessarily the last record. For a fine discretization scale, $\Delta \ll 1$, the
sum in (3) can be replaced by an integral to give
$P_n^\Delta \approx \frac{1}{n}(1 - \Delta)^n$. Thus for any $\Delta > 0$,
$P_n^\Delta$ no longer decays as $1/n$, but instead approaches zero exponentially
with $n$: rounding strongly depresses the asymptotic record rate for the uniform
distribution.
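
The exponential decay can be checked directly from the lattice sum for the
uniform distribution. The sketch below is an illustration only; the function name
`uniform_strong_record_rate` and the values $L = 10$, $n = 200$ are assumptions.
It evaluates the sum exactly and compares the ratio of successive record rates
with $1 - \Delta$.

```python
def uniform_strong_record_rate(n, L):
    """Exact strong record rate for uniform RVs on [0, 1] with delta = 1/L.

    Implements P_n = delta * sum_{k=1}^{L-1} (k*delta)**(n-1), valid for n >= 2.
    """
    delta = 1.0 / L
    return delta * sum((k * delta) ** (n - 1) for k in range(1, L))

L = 10
delta = 1.0 / L
n = 200
# For large n the k = L-1 term dominates the sum, so the ratio of successive
# record rates should approach 1 - delta (pure exponential decay).
ratio = uniform_strong_record_rate(n + 1, L) / uniform_strong_record_rate(n, L)
```

The ratio settling at $1 - \Delta$ reflects the fact that a new record is only
possible as long as the lattice value $1 - \Delta$ has not yet been reached.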

A more general example of the Weibull EVS class is
$f(x) = \xi(1-x)^{\xi-1}$, with $\xi > 0$ and $x \in [0,1]$. By expanding Eq. (2)
to second order for $\Delta \ll 1$, we find

$$
P_n^\Delta = \sum_k \left[(1-k\Delta)^\xi - (1-(k+1)\Delta)^\xi\right] \left[1 - (1-k\Delta)^\xi\right]^{n-1}
\approx
\begin{cases}
\frac{1}{n}\left[1 - \ldots\right], & n\Delta^\xi < 1, \\
\frac{1}{n}\exp(-n\Delta^\xi), & n\Delta^\xi > 1.
\end{cases} \tag{4}
$$



Since the underlying distribution has a bounded support, the total number of
records is again finite. The results in (4) reproduce those found for the uniform
distribution.

FIG. 2: (color online) Scaled record rate $nP_n^\Delta$ for $n = 1000$ for the
Gaussian, exponential, and Pareto (with $\mu = 1.2$) distributions. Without
rounding, $P_n = 1/n$. Simulations (symbols) are averaged over $10^6$ time series
and over $975 < n < 1025$ to smooth the data. Analytical predictions (curves) are
shown for comparison. For the origin of the peaks for the Gaussian and
exponential distributions, see the text following Eq. (14).

Gumbel class: As a basic example, we treat the exponential distribution
$f(x) = e^{-x}$. For $n \gg 1$ we replace the sum in Eq. (2) by an integral and
find

$$P_n^\Delta \approx \frac{1 - e^{-\Delta}}{n\Delta} \tag{5}$$

for arbitrary $\Delta > 0$, in agreement with findings for the geometric
distribution in Ref. [18] and with our simulations (Fig. 2). For $\Delta \ll 1$,
(5) reduces to $P_n^\Delta \approx \frac{1}{n}\left(1 - \frac{\Delta}{2}\right)$,
while for $\Delta \gg 1$, $P_n^\Delta \approx \frac{1}{n\Delta}$. In contrast to
the Weibull class, $P_n^\Delta$ asymptotically decays as $1/n$ for arbitrary
$\Delta$.
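
For the exponential distribution, replacing the lattice sum by an integral gives
$(1 - e^{-\Delta})/(n\Delta)$, which reproduces both limits quoted above. The
following sketch compares this estimate with a direct evaluation of the sum. The
function names, the truncation rule for the sum, and the parameter values are
assumptions made for illustration.

```python
import math

def exponential_strong_record_rate(n, delta, k_max=None):
    """Sum over the lattice k = 0, 1, 2, ... for f(x) = exp(-x), F(x) = 1 - exp(-x).

    The sum is truncated at k_max, chosen by default far beyond the typical
    record value ln(n), where the remaining terms are negligible.
    """
    if k_max is None:
        k_max = int((math.log(n) + 40.0) / delta) + 1
    total = 0.0
    for k in range(k_max + 1):
        f_lo = 1.0 - math.exp(-k * delta)
        f_hi = 1.0 - math.exp(-(k + 1) * delta)
        total += (f_hi - f_lo) * f_lo ** (n - 1)
    return total

def integral_approximation(n, delta):
    """Sum-to-integral estimate (1 - exp(-delta)) / (n * delta)."""
    return (1.0 - math.exp(-delta)) / (n * delta)

n, delta = 1000, 0.5
exact = exponential_strong_record_rate(n, delta)
approx = integral_approximation(n, delta)
```

For moderate $\Delta$ the two values agree closely; for larger $\Delta$ the
discreteness of the sum produces deviations from the smooth estimate.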

For the Gaussian distribution $f(x) = e^{-x^2/2}/\sqrt{2\pi}$ with unit standard
deviation, we find that as $n \to \infty$

For $n \to \infty$ we evaluate this integral by the Laplace method by expanding
the integrand about $x^* = \sqrt{\ln(n^2/2\pi)}$, where $x^*$ is the mean value of
the $n$th record. After some calculation, we obtain

Thus the record rate decays slightly faster than $1/n$ (Fig. 2). Correspondingly,
$R_n^\Delta \propto \Delta^{-1}(\ln n)^{1/2}$, which diverges weakly as
$n \to \infty$.

Fréchet class: A representative for this class is the Pareto distribution
$f(x) = \mu x^{-\mu-1}$, with $x > 1$ and $\mu > 0$. Using again Eq. (2), the
asymptotic record rate $P_n^\Delta$ is

In contrast to the two previous classes, the effect of the rounding is
negligible, as $P_n^\Delta \to P_n$ for $n \to \infty$ and arbitrary $\Delta$
(Fig. 2).
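
That rounding barely matters for a power-law tail can be illustrated by summing
the lattice expression for the Pareto distribution directly. This is a sketch
under stated assumptions: the helper names, the truncation point `k_max`, and the
parameter values ($\mu = 1.2$, $\Delta = 1$, $n = 1000$, chosen to match the
figure) are illustrative, and the truncation slightly underestimates the sum.

```python
def pareto_cdf(x, mu):
    """CDF of the Pareto density mu * x**(-mu - 1) on x > 1."""
    return 1.0 - x ** (-mu) if x > 1.0 else 0.0

def pareto_strong_record_rate(n, delta, mu, k_max):
    """Lattice sum for Pareto RVs, truncated at lattice index k_max."""
    total = 0.0
    for k in range(k_max + 1):
        f_lo = pareto_cdf(k * delta, mu)
        f_hi = pareto_cdf((k + 1) * delta, mu)
        total += (f_hi - f_lo) * f_lo ** (n - 1)
    return total

n, delta, mu = 1000, 1.0, 1.2
k_max = 200_000  # the neglected tail is bounded by 1 - F(k_max * delta)
scaled_rate = n * pareto_strong_record_rate(n, delta, mu, k_max)
```

The scaled rate $nP_n^\Delta$ stays just below 1, the value for continuous RVs,
because typical record values are much larger than $\Delta$.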

Small-$\Delta$ regime. We now focus on the effects of rounding when the
discretization scale is small ($\Delta \ll 1$) for fixed $n$. Here we find a
useful analogy between the effect of a linear drift in RVs [13] and the effect of
rounding, and we adapt methods developed for the former problem to help elucidate
rounding effects. For small $\Delta$ the general expression (2) for $P_n^\Delta$
simplifies to

$$
\begin{aligned}
P_n^\Delta &= \sum_k \int_{k\Delta}^{(k+1)\Delta} dx\, f(x)\, F^{n-1}(k\Delta) \\
&= \int dx\, f(x)\, F^{n-1}(\lfloor x \rfloor_\Delta) \\
&\approx \frac{1}{n} - (n-1) \int dx\, (x - \lfloor x \rfloor_\Delta)\, f^2(x)\, F^{n-2}(x).
\end{aligned} \tag{9}
$$

Here $\lfloor x \rfloor_\Delta$ is defined as the largest integer multiple of
$\Delta$ that is smaller than $x$. Thus, in the second line,
$k\Delta = \lfloor x \rfloor_\Delta$ for $k\Delta < x < (k+1)\Delta$, which
obviates writing the sum. In the last step, we expand to first order in the
quantity $x - \lfloor x \rfloor_\Delta$ and employ the crude assumption that, on
average, $x - \lfloor x \rfloor_\Delta \approx \Delta/2$ to give, for $n \gg 1$,

$$P_n^\Delta \approx \frac{1}{n}\left(1 - \frac{\Delta}{2}\, n^2 I_n\right), \tag{10}$$

where $I_n = \int dx\, f^2(x) F^{n-2}(x)$. The approximation underlying (10) is
valid if $n^2 \Delta I_n \ll 1$. The quantity $I_n$ appears in record statistics
that arise from continuous RVs with a linear drift [13], and its behavior is
known for a wide range of distributions. In the following we use the results from
[13] to determine $P_n^\Delta$ in the small-$\Delta$ regime.
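
As a concrete check of the small-$\Delta$ form
$\frac{1}{n}\left(1 - \frac{\Delta}{2} n^2 I_n\right)$, one can take the
exponential distribution, for which the substitution $u = e^{-x}$ gives
$I_n = \int_0^1 u(1-u)^{n-2}\,du = 1/[n(n-1)]$. The sketch below compares the
approximation with the direct lattice sum; the function names and the values
$n = 100$, $\Delta = 0.05$ are assumptions for illustration.

```python
import math

def small_delta_rate(n, delta, i_n):
    """Small-delta approximation (1/n) * (1 - (delta/2) * n**2 * I_n)."""
    return (1.0 - 0.5 * delta * n * n * i_n) / n

def i_n_exponential(n):
    """I_n = int f(x)^2 F(x)^(n-2) dx for f(x) = exp(-x), equal to 1/(n(n-1))."""
    return 1.0 / (n * (n - 1))

def exact_rate_exponential(n, delta):
    """Direct lattice sum for the exponential distribution, truncated far
    beyond the typical record value ln(n)."""
    k_max = int((math.log(n) + 40.0) / delta) + 1
    total = 0.0
    for k in range(k_max + 1):
        f_lo = 1.0 - math.exp(-k * delta)
        f_hi = 1.0 - math.exp(-(k + 1) * delta)
        total += (f_hi - f_lo) * f_lo ** (n - 1)
    return total

n, delta = 100, 0.05
approx = small_delta_rate(n, delta, i_n_exponential(n))
exact = exact_rate_exponential(n, delta)
```

Here $n^2 \Delta I_n \approx 0.05$, well inside the stated range of validity, and
the two values differ by less than one part in a thousand.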

Weibull and Fréchet classes: For the distribution $f(x) = \xi(1-x)^{\xi-1}$
introduced above, the approximation given by Eq. (10) is useful for $\xi > 1$ and
we find, for $n\Delta^\xi < 1$,

which, for $n\Delta^\xi \ll 1$ and $\xi > 1$, agrees with the result derived from
our general approach in Eq. (4). Similarly, for the Pareto distribution we
recover Eq. (8).

Gumbel class: For the exponential distribution, we find
$P_n^\Delta \approx \frac{1}{n}\left(1 - \frac{\Delta}{2}\right)$, which agrees
with the small-$\Delta$ behavior of Eq. (5). For the Gaussian distribution, the
small-$\Delta$ approximation allows us to obtain a new expression for the record
rate when $\sqrt{\ln n} \ll \Delta^{-1}$,

The regime $\sqrt{\ln n} \ll \Delta^{-1}$ is not accessible through the general
approach and this range is particularly important for applications, such as in
climatology. For $n \gg 1$ and $\Delta < 1$, Eq. (12) reproduces the numerical
simulation values for $P_n^\Delta$ very accurately (Fig. 3).

FIG. 3: (color online) Simulations of the record rate for Gaussian RVs in the
small-$\Delta$ regime. Thin curves show results for three values of $\Delta$ and
$n \in [0, 100]$. For each $\Delta$, $10^6$ time series were simulated. The thick
dashed curve depicts the analytical prediction Eq. (12). Inset shows the same
analysis for a single value of $\Delta$ with $n \in [1, 1000]$.

Large-$\Delta$ regime. For Gumbel-class distributions that decay at least
exponentially fast near the upper limit, we can provide an alternative
description for the record number $R_n^\Delta$. For these distributions, it is
known that the average spacings between the record events do not increase in time
for large $n$ [11]. Therefore, we may choose a sufficiently large value of
$\Delta$ that almost all records are suppressed because of ties. It then follows
that all discrete values $k\Delta$ (with $k > 0$) will eventually be record
values and $R_n^\Delta$ is just the sum over the probabilities that a record has
already occurred for a certain value $k\Delta$. The corresponding probabilities
$\Pi_n(k)$ for record value $k\Delta$ are given by
$\Pi_n(k) \approx 1 - F(k\Delta)^{n-1}$, which leads to



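$$R_n^\Delta \approx \sum_{k=0}^{\infty} \Pi_n(k) \approx 1 + \sum_{k=1}^{\infty} \left[1 - F^{n-1}(k\Delta)\right]. \tag{13}$$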



For elementary Gumbel distributions, interesting properties emerge from
$\Pi_n(k)$. For small $n$ and large $k\Delta$, it is obvious that
$\Pi_n(k) \approx 0$. Conversely, for large $n$ and arbitrary $k\Delta$,
eventually $\Pi_n(k) \approx 1$, since $F(k\Delta) < 1$ for finite $k\Delta$.
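
The switching behavior of the probabilities $1 - F(k\Delta)^{n-1}$ is easy to
see numerically. The sketch below uses the exponential distribution and
$\Delta = 10$, both chosen for illustration; the function names are hypothetical.
It sums the probabilities over the lattice and evaluates the result at a time
that lies, on a logarithmic scale, halfway between two consecutive steps.

```python
import math

def pi_n(k, n, delta):
    """Probability 1 - F(k*delta)**(n-1) that lattice value k*delta has been
    reached within n exponential RVs, with F(x) = 1 - exp(-x)."""
    if k == 0:
        return 1.0 if n > 1 else 0.0                 # F(0) = 0
    log_f = math.log1p(-math.exp(-k * delta))        # ln F(k*delta), accurate near 0
    return -math.expm1((n - 1) * log_f)              # 1 - exp((n-1) * ln F)

def record_number_large_delta(n, delta, k_max=200):
    """Approximate mean record number as the sum of pi_n(k) over k = 0..k_max."""
    return sum(pi_n(k, n, delta) for k in range(k_max + 1))

delta = 10.0
# Halfway (in ln n) between the steps expected near ln n = 2*delta and 3*delta.
n_between = math.exp(2.5 * delta)
plateau = record_number_large_delta(n_between, delta)
```

At this time the lattice values $0$, $\Delta$, and $2\Delta$ have almost surely
been reached while $3\Delta$ almost surely has not, so the sum sits on a plateau
close to 3.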

We now estimate the regime where $\Pi_n(k)$ switches between 0 and 1; this
condition also determines the point where the mean record number switches from
$k - 1$ to $k$. Since $\Pi_n(k)$ will never be exactly 0 or 1, we seek the time
$n$, where $\Pi_n(k)$ is either smaller than $\epsilon$ ($n = n_-$) or larger
than $1 - \epsilon$ ($n = n_+$) for small $\epsilon \ll 1$. By elementary means
we find

$$n_- \approx \frac{\ln(1-\epsilon)}{\ln F(k\Delta)}, \qquad n_+ \approx \frac{\ln \epsilon}{\ln F(k\Delta)}. \tag{14}$$
FIG. 4: (color online) Record number $R_n^\Delta$ for Gaussian RVs for
$\Delta = 1, 2, 4$. Data (bold lines) are based on 100 realizations with a
maximal $n = 10^{60}$. For $n > 10^6$ we used an algorithm that directly
simulates record events by sampling both the distribution and the waiting time of
the $(k+1)$st record from the value of the $k$th record. Thin lines show the
asymptotic behavior predicted by Eq. (15). The vertical lines show the steps
predicted by $n \approx \sqrt{2\pi}\, k\Delta\, e^{(k\Delta)^2/2}$.
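
The idea of simulating record events directly, rather than every observation, can
be sketched as follows. This is not the algorithm used for the figure, whose
details are not given here. As an assumption, the sketch uses exponential RVs,
for which the waiting time until the next strong record is geometric and, by
memorylessness, the excess of the new record over the next lattice point is again
exponential. The function name `simulate_record_times` is hypothetical.

```python
import math
import random

def simulate_record_times(delta, n_max, rng):
    """Event-driven simulation of strong records for exponential RVs rounded
    down to multiples of delta. Returns the times at which strong records occur.
    """
    times = [1]                                    # the first observation is a record
    k = math.floor(rng.expovariate(1.0) / delta)   # lattice index of that record
    n = 1
    while True:
        # A new strong record needs X >= (k+1)*delta, which has probability q.
        q = math.exp(-(k + 1) * delta)
        # Geometric waiting time via inverse transform sampling.
        u = 1.0 - rng.random()                     # u lies in (0, 1]
        wait = math.floor(math.log(u) / math.log1p(-q)) + 1
        n += wait
        if n > n_max:
            return times
        times.append(n)
        # Memorylessness: the excess over (k+1)*delta is again exponential.
        excess = rng.expovariate(1.0)
        k = k + 1 + math.floor(excess / delta)

rng = random.Random(1)
times = simulate_record_times(2.0, 10**9, rng)
```

The cost of one realization scales with the number of records rather than with
$n$, which is what makes very long time series accessible.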



Evidently $\Pi_n(k)$ switches between 0 and 1 when $n$ is between $n_-$ and
$n_+$, where $n_-$ and $n_+$ are both proportional to
$[\ln(F(k\Delta))]^{-1}$. For the exponential distribution, for example, we find
that $n_- = \epsilon\, e^{k\Delta}$ and $n_+ = \ln(1/\epsilon)\, e^{k\Delta}$, so
the $k$th record will occur at a time proportional to $e^{k\Delta}$, leading to a
mean record number of $R_n^\Delta \approx \frac{1}{\Delta} \ln n$. In the large
$k\Delta$ regime, records occur in an ordered fashion and are well separated from
each other. The $(k+1)$st record occurs at time $e^{(k+1)\Delta}$, which for
$\Delta \gg 1$, is much later than the time of the $k$th record. Thus the mean
record number undergoes a step-like periodicity when plotted against $\ln n$. For
the Gaussian distribution, the same approach now predicts that $\Pi_n(k)$
switches for $n \approx \sqrt{2\pi}\, k\Delta\, e^{k^2\Delta^2/2}$ (Fig. 4). For
large $k\Delta$ and large $n$, the mean record number becomes



which was already obtained with the general approach above and confirms the
validity of the form for $R_n^\Delta$ given in Eq. (13). The step periodicity in
$R_n^\Delta$ is the source of the observed peaks (Fig. 2) in the record rate
$P_n^\Delta$ as a function of $\Delta$ for exponential and Gaussian
distributions.

Conclusions. We determined how rounding down continuous random variables affects
the statistics of records. Our results directly apply to the practical situation
where continuous variables are rounded either up or down to the closest integer
multiple of a fixed discretization scale $\Delta$.

For distributions with bounded support, rounding leads to an exponential decay of
the record rate, $P_n^\Delta$, and an asymptotically finite record number. In
contrast, for power-law distributions, the effect of rounding becomes negligible
for $n \to \infty$ and $P_n^\Delta \to 1/n$ independent of $\Delta$. In the
intermediate Gumbel class, the behavior is more subtle. For the exponential
distribution, $P_n^\Delta$ decays as $1/n$ with a $\Delta$-dependent prefactor,
while for the general distribution $f(x) \propto \exp(-|x|^\beta)$ with
$\beta > 1$, the record rate decays as $n^{-1} (\ln n)^{1/\beta - 1}$.

For underlying distributions that decay at least exponentially, the record
sequence becomes ordered at long times, in marked contrast to independent record
events from continuous iid RVs. While correlations between record events have
been previously observed for RVs that are drawn from drifting [14] or broadening
[12] distributions, the effect of rounding is much stronger and renders record
events predictable on a time scale that grows exponentially (or faster) with
record number.

To illustrate that rounding effects have an observationally significant influence
on records, we analyzed 50 years of daily temperatures from 361 U.S. weather
stations [25] along the lines of earlier studies. The measurements were reported
in integer units of $\Delta = 1\,^\circ$F and we considered all $361 \times 365$
time series for the individual calendar days with an average standard deviation
of $\sigma \approx 8.9\,^\circ$F. Only 75% of the weak upper (ties allowed) and
78% of the weak lower records were also strong records (no ties), in good
agreement with the value of 79% predicted by our analytical result in Eq. (12).
In this example the effect of ties on the record rate has a similar magnitude as
that of the small warming trend in the data. Thus rounding effects should be
carefully accounted for if one wishes to use record statistics to detect secular
trends in data, such as global warming.